Abstract In this paper we study canonical γ-structures, a class of RNA pseudoknot
structures that plays a key role in the context of polynomial time folding of RNA
pseudoknot structures. A γ-structure is composed of specific building blocks that have
topological genus less than or equal to γ, where composition means concatenation and
nesting of such blocks. Our main result is the derivation of the generating function of
γ-structures via symbolic enumeration. γ-structures are constructed via γ-matchings.
We compute an algebraic equation for the generating function of these matchings and
prove that the generating function is its unique solution. For γ = 1 and γ = 2 we compute
the Puiseux expansion of this power series at its unique, dominant singularity. This
allows us to derive simple asymptotic formulas for the number of 1-structures and
2-structures.

1 Introduction and background



An RNA sequence is a linear, oriented sequence of the nucleotides (bases) A, U, G, C.
These sequences "fold" by establishing bonds between pairs of nucleotides. These bonds
cannot form arbitrarily: a nucleotide can establish at most one bond, and the global
conformation of an RNA molecule is determined by topological constraints encoded
at the level of secondary structure, i.e., by the mutual arrangements of the base pairs
(Bailor et al.). Secondary structures can be interpreted as (partial) matchings
in a graph of permissible base pairs, Tabaska et al. (1998a). When represented as a
diagram, i.e. as a graph whose vertices are drawn on a horizontal line with arcs in the
upper half-plane, one refers to a secondary structure with crossing arcs as a pseudoknot
structure.

Folded configurations are energetically optimal. Here energy means free energy,
which is dominated by the stacking of adjacent base pairs and not by the hydrogen
bonds of the individual base pairs, Mathews et al. (1999), as well as minimum arc-length
conditions, Smith and Waterman (1978). That is, a stack is tantamount to a sequence
of parallel arcs ((i, j), (i + 1, j − 1), . . . , (i + r, j − r)). In particular, only
configurations without isolated bonds and without bonds of length one (formed by immediately
subsequent nucleotides) are observed in RNA structures. For a given RNA sequence,
polynomial-time dynamic programming (DP) algorithms can be devised that find such
minimal energy configurations.



The topological classification of RNA structures, Bon et al. (2008); Andersen et al.
(2011a), has recently been translated into an efficient dynamic programming algorithm,
Reidys et al. (2011). This algorithm a priori folds into a novel class of pseudoknot
structures, the γ-structures. γ-structures differ from pseudoknotted RNA structures of
fixed topological genus of an associated fatgraph or double line graph, Orland and Zee
(2002) and Bon et al. (2008), since they have arbitrarily high genus. They are composed
of irreducible subdiagrams whose individual genus is bounded by γ and contain no
bonds of length one (1-arcs), see Section 5 for details. The folding of γ-structures has
led to unprecedented sensitivity and positive predictive value, Reidys et al. (2011).



In Nebel and Weinberg (2011a), Nebel and Weinberg study a plethora of RNA structures
appearing in the context of DP folding routines by means of multiple context-free
grammars. From these grammars algebraic equations are devised for the generating
functions of the corresponding structure classes. The authors study the case γ = 1 and
find that in the limit of large n there are $j_1\, n^{-3/2} (\varrho_{1,1}^{-1})^n$ 1-structures,
where $\varrho_{1,1}^{-1} = 3.8782$ and $j_1$ is some positive constant. The presentation in
Nebel and Weinberg (2011a), however, has its main focus on other aspects of pseudoknots
(e.g. a framework for their classification and comparison). Accordingly, the authors only
sketch the way asymptotics for the number of different structures has been computed,
discussing neither irreducibility of polynomial equations nor uniqueness of power series
solutions.



In this paper we study canonical γ-structures, i.e. partial matchings composed
of irreducible motifs of genus ≤ γ, without isolated arcs and 1-arcs. Due to the
extended stacking of arcs, canonical γ-structures are realistic folding targets of minimum
free energy DP algorithms. We identify a polynomial $P_\gamma(u,X)$ whose unique solution
equals the generating function of γ-matchings. We then have a closer look at
the cases γ = 1, 2 and prove that $P_1(u,X)$ and $P_2(u,X)$ are irreducible. This fact
is of importance for interpreting the generating function of γ-matchings at its unique
dominant singularity as a Puiseux series, which in turn implies, by means of transfer
theorems, Flajolet and Sedgewick (2009), simple asymptotic formulas for the numbers
of γ-matchings.



γ-matchings are the stepping stone to derive via Lemma 4 the further refined,
bivariate generating function of γ-shapes, i.e. γ-matchings containing only stacks
composed of a single arc. This generating function additionally keeps track of the 1-arcs,
which are vital for the later inflation into γ-structures. We then compute the generating
function of τ-canonical γ-structures by inflating γ-shapes by means of symbolic
enumeration. In the process we rediscover and generalize Nebel and Weinberg's formula to
arbitrary γ.
Fig. 1 (a) a diagram containing a rainbow (bold) and three stacks ((5, 9), (6, 8)),
((10, 15), (11, 14), (12, 13)), and ((1, 19), (2, 18), (3, 17), (4, 16)). (b) the maximal
arcs of a diagram displayed in bold.

Fig. 2 σ- and P-intervals.

2 Some basic facts 

2.1 γ-diagrams

A diagram is a labeled graph over the vertex set [n] = {1, . . . , n} in which each vertex
has degree ≤ 3, represented by drawing its vertices in a horizontal line. The backbone
of a diagram is the sequence of consecutive integers (1, . . . , n) together with the edges
{{i, i + 1} | 1 ≤ i ≤ n − 1}. The arcs of a diagram, (i, j), where i < j, are drawn in
the upper half-plane. We shall distinguish the backbone edge {i, i + 1} from the arc
(i, i + 1), which we refer to as a 1-arc.

A stack of length τ is a maximal sequence of "parallel" arcs,

$$((i, j), (i + 1, j - 1), \dots, (i + \tau, j - \tau)).$$

A stack of length ≥ τ is called a τ-canonical stack; in particular, a stack of length zero is an
isolated arc. The particular arc (1, n) is called a rainbow and an arc is called maximal
if it is maximal with respect to the partial order (i, j) ≤ (i', j') iff i' ≤ i ∧ j ≤ j', see
Fig. 1.

A stack of length τ, ((i, j), (i + 1, j − 1), . . . , (i + τ, j − τ)), induces a sequence of pairs
of intervals (([i, i + 1], [j − 1, j]), ([i + 1, i + 2], [j − 2, j − 1]), . . .). We call any of
these 2τ intervals a P-interval. The interval [i + τ, j − τ] is called a σ-interval, see Fig. 2.
We shall consider diagrams as fatgraphs, 𝔾, that is graphs G together with a
collection of cyclic orderings, called fattenings, one such ordering on the half-edges incident
on each vertex. Each fatgraph 𝔾 determines an oriented surface F(𝔾), Loebl and Moffatt
(2008); Penner et al. (2010b), which is connected if G is and has some associated genus
g(G) ≥ 0 and number r(G) ≥ 1 of boundary components. Clearly, F(𝔾) contains G as
a deformation retract, Massey (1967). Fatgraphs were first applied to RNA secondary
structures in Penner and Waterman (1993a) and Penner (2004).

A diagram G hence determines a unique surface F(𝔾) (with boundary). Filling
the boundary components with discs we can pass from F(𝔾) to a surface without
boundary. Euler characteristic, χ, and genus, g, of this surface are given by χ = v − e + r
and g = 1 − χ/2, respectively, where v, e, r denote the number of discs, ribbons and boundary
components in 𝔾, Massey (1967). The genus of a diagram is that of its associated surface
without boundary.
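
As an illustration of the genus formula, the following Python sketch computes the genus of a diagram whose arcs form a perfect matching on [2n]. It rests on the assumption that the backbone is contracted to a single disc (v = 1), so that e is the number of arcs and the boundary components r are the cycles of the permutation "step to the next backbone vertex, then cross the arc attached there". The function name `genus` and the input format are choices made for this example only.

```python
def genus(arcs):
    """Genus of a matching given as a list of arcs (i, j) on the vertices 1..2n.

    Assumption of this sketch: the backbone is one disc (v = 1), every arc is
    one ribbon (e = n), and boundary components are counted as permutation cycles.
    """
    n = len(arcs)
    if n == 0:
        return 0
    partner = {}
    for i, j in arcs:
        partner[i], partner[j] = j, i
    if sorted(partner) != list(range(1, 2 * n + 1)):
        raise ValueError("arcs must form a perfect matching on 1..2n")

    seen, r = set(), 0
    for start in range(1, 2 * n + 1):
        if start in seen:
            continue
        r += 1                      # a new boundary component
        p = start
        while p not in seen:
            seen.add(p)
            # move to the next vertex on the (closed) backbone, then across its arc
            p = partner[p % (2 * n) + 1]

    chi = 1 - n + r                 # Euler characteristic v - e + r with v = 1
    return 1 - chi // 2             # g = 1 - chi/2


print(genus([(1, 4), (2, 3)]))      # two nested arcs
print(genus([(1, 3), (2, 4)]))      # two crossing arcs
```

Two nested arcs give genus 0 and two crossing arcs give genus 1; crossing diagrams with three and four arcs such as ((1, 4), (2, 5), (3, 6)) and ((1, 4), (2, 6), (3, 7), (5, 8)) also have genus 1.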

The shadow of a diagram of genus g is obtained by removing all noncrossing arcs,
deleting all isolated vertices and collapsing all induced stacks (i.e., maximal subsets of
subsequent, parallel arcs) to single arcs, see Fig. 3.

The shadow of a diagram G, σ(G), can possibly be empty. Furthermore, projecting
into the shadow does not affect genus. Any shadow of genus g over one backbone
contains at least 2g and at most (6g − 2) arcs. In particular, for fixed genus g, there
exist only finitely many shadows, Reidys et al. (2011); Andersen et al. (2011b). In Fig. 4
we display the four shadows of genus one.

We denote shadows by σ.

A diagram is called irreducible if and only if for any two arcs, $\alpha_1, \alpha_k$, contained in
E, there exists a sequence of arcs $(\alpha_1, \alpha_2, \dots, \alpha_{k-1}, \alpha_k)$ such that
$(\alpha_i, \alpha_{i+1})$ are crossing. Irreducibility is equivalent to the concept of primitivity
introduced by Bon et al. (2008), inspired by the work of Dyson (1949). According to
Andersen et al. (2011b), for arbitrary genus g and 2g ≤ ℓ ≤ (6g − 2), there exists an
irreducible shadow of genus g having exactly ℓ arcs. We may reuse Fig. 4 as an
illustration of this result since the four shadows of genus one are all irreducible.

Fig. 5 A diagram G is decomposed: we remove any noncrossing arcs and isolated points,
collapse any stacks into single arcs and finally remove irreducible G-shadows from bottom
to top, collapsing any stack generated in the process into a single arc.

Let $i_g(m)$ denote the number of irreducible shadows of genus g with m arcs. Since
for fixed genus g there exist only finitely many shadows we have the generating
polynomial of irreducible shadows of genus g

$$I_g(z) = \sum_{m=2g}^{6g-2} i_g(m)\, z^m.$$

For instance for genus 1 and 2 we have

$$I_1(z) = z^2 + 2z^3 + z^4,$$

$$I_2(z) = 17z^4 + 160z^5 + 566z^6 + 1004z^7 + 961z^8 + 476z^9 + 96z^{10}.$$

The shadow σ(G) of a diagram G decomposes into a set of irreducible shadows.
We shall call these shadows irreducible G-shadows.

Any diagram G can iteratively be decomposed by first removing all noncrossing 
arcs as well as isolated vertices, second collapsing any stacks and third by removing 
irreducible G-shadows iteratively as follows, see Fig. [5] 

• one removes (i.e. cuts the backbone at two points and after removal merges the 
cut-points) irreducible G-shadows from bottom to top, i.e. such that there exists no 
irreducible G-shadow that is nested within the one previously removed. 

• if the removal of an irreducible G-shadow induces the formation of a stack, it is 
collapsed into a single arc. 

A diagram, G, is a γ-diagram if and only if for any irreducible G-shadow, G',
g(G') ≤ γ holds.

We denote the set of τ-canonical γ-diagrams by $\mathbf{G}_{\tau,\gamma}$. Such a diagram without arcs
of the form (i, i + 1) (1-arcs) is called a τ-canonical γ-structure and their set is denoted
by $\mathcal{G}_{\tau,\gamma}$. The set of γ-diagrams that contain only vertices of degree three (γ-matchings)
is denoted by $\mathcal{H}_\gamma$, and the set of γ-matchings that contain only stacks of length zero
(γ-shapes) is denoted by $\mathcal{S}_\gamma$.
2.2 Some generating functions

In this paper we denote the ring of polynomials over a ring R by R[X] and the ring of
formal power series $\sum_{n\ge 0} a_n X^n$ by R[[X]]. For a field K, the ring K[[X]] is a local ring
with maximal ideal (X), i.e. any power series with nonzero constant term is invertible.
A Puiseux series, Wall (2004), is a power series in fractional powers of X, i.e.
$\sum_{n\ge 0} a_n X^{n/k}$ for some fixed k ∈ ℕ.

We denote the generating function of a set of diagrams $\mathcal{D}$ filtered by the number
of arcs by $\mathbf{D}(z) = \sum_{2n\ge 0} d(n)\, z^n$. Similarly, a generating function of diagrams filtered
by the length of the backbone is written as $D(z) = \sum_{n\ge 0} d(n)\, z^n$. In particular, the
generating functions of γ-matchings and τ-canonical γ-structures are given by

$$H_\gamma(u) = \sum_{2n\ge 0} h_\gamma(n)\, u^n, \qquad G_{\tau,\gamma}(z) = \sum_{n\ge 0} g_{\tau,\gamma}(n)\, z^n.$$

Let $\mathcal{H}_\gamma(n,m) \supseteq \mathcal{S}_\gamma(n,m)$ denote the collections of all γ-matchings and γ-shapes on
2n ≥ 0 vertices containing m ≥ 0 1-arcs with generating functions

$$H_\gamma(x,y) = \sum_{m,\,2n\ge 0} h_\gamma(n,m)\, x^n y^m, \qquad S_\gamma(x,y) = \sum_{m,\,2n\ge 0} s_\gamma(n,m)\, x^n y^m,$$

where $h_\gamma(n,m) = s_\gamma(n,m) = 0$ if 2γ > n or if m > n.

Furthermore there is a natural projection ϑ from γ-matchings to γ-shapes defined
by collapsing each non-empty stack onto a single arc

$$\vartheta\colon \mathcal{H}_\gamma \longrightarrow \mathcal{S}_\gamma,$$

which is surjective and preserves irreducible shadows as well as the number of 1-arcs.
ϑ restricts to a surjection

$$\vartheta\colon \bigcup_{n\ge 0} \mathcal{H}_\gamma(n,m) \longrightarrow \bigcup_{n\ge 0} \mathcal{S}_\gamma(n,m),$$

which collapses each stack to an arc and preserves any irreducible shadow and also the
number m of 1-arcs.



3 Combinatorics of γ-matchings

In this section we study γ-matchings.

Theorem 1 Let R = ℤ[u]. Then the following assertions hold:

(a) the generating function of γ-matchings, $H_\gamma(u)$, satisfies

$$H_\gamma(u)^{-1} = 1 - \left( u\,H_\gamma(u) + H_\gamma(u)^{-1} \sum_{g\le\gamma} I_g\!\left( \frac{u\,H_\gamma(u)^2}{1 - u\,H_\gamma(u)^2} \right) \right). \tag{1}$$

In particular, there exists a polynomial $P_\gamma(u,X) \in R[X]$ of degree (12γ − 2), whose
coefficients are sums of $I_g(z)$ coefficients, such that $P_\gamma(u, H_\gamma(u)) = 0$.

(b) eq. (1) determines $H_\gamma(u)$ uniquely.

Fig. 6 First, a fixed irreducible shadow σ is inflated into an $\mathcal{L}_\sigma$-diagram, second we pass to an
$\mathcal{F}_\sigma$-diagram by inserting a nontrivial $\mathcal{F}_\sigma$-diagram in one of the P-intervals.

Proof We first prove (a). Let σ be a fixed irreducible shadow of genus g having m arcs.
Let $\mathcal{V}_\sigma$ be the set of diagrams generated by concatenating and nesting σ.
Claim 1:

$$V_\sigma(u)^{-1} = 1 - u^m\, V_\sigma(u)^{2m-1}.$$

To prove Claim 1 we consider a $\mathcal{V}_\sigma$-diagram. Clearly, its maximal arcs are contained
in t ≥ 1 copies of σ. These arcs induce exactly (2m − 1)t σ-intervals, in each of which
we find again an element of $\mathcal{V}_\sigma$, whence

$$V_\sigma(u) = \sum_{t\ge 0} \left( u^m\, V_\sigma(u)^{2m-1} \right)^t$$

and Claim 1 follows.

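Let $\mathcal{L}_\sigma$ be the set of diagrams having the fixed shape σ obtained by inflating σ-arcs
into stacks, or symbolically, $\mathcal{U} \times \mathrm{Seq}(\mathcal{U})$. Here $\mathcal{U}$ and $\mathcal{R} = \mathrm{Seq}(\mathcal{U})$ denote the classes
of arcs and sequences of arcs. Clearly, the associated generating function of $\mathcal{U} \times \mathcal{R}$ is
$u(1 - u)^{-1}$.

Note that each $\mathcal{L}_\sigma$-diagram contains exactly (2m − 1) σ-intervals and an arbitrary
number of pairs of P-intervals. Let $\mathcal{F}_\sigma$ denote the set of diagrams generated by
concatenating and nesting $\mathcal{L}_\sigma$-diagrams that contain no empty P-intervals. Let finally $\mathcal{W}_\sigma$
be the set of 1-canonical diagrams having shapes in $\mathcal{F}_\sigma$.

Claim 2.

$$W_\sigma(u)^{-1} = 1 - \left( \frac{\frac{u}{1-u}}{1 - \frac{u}{1-u}\left( W_\sigma(u)^2 - 1 \right)} \right)^{m} W_\sigma(u)^{2m-1}.$$

We shall construct $\mathcal{W}_\sigma$ using arcs, $\mathcal{U}$, sequences of arcs, $\mathcal{R}$, induced arcs, $\mathcal{N}$, and
sequences of induced arcs, $\mathcal{M}$. The class $\mathcal{F}_\sigma$ is obtained by concatenating and nesting
$\mathcal{L}_\sigma$-diagrams that do not contain any empty P-intervals, see Fig. 6. An induced arc is
an arc together with at least one nontrivial $\mathcal{F}_\sigma$-diagram in either one or in both
P-intervals,

$$\mathcal{N} = \mathcal{U} \times \left( (\mathcal{F}_\sigma - 1) + (\mathcal{F}_\sigma - 1) + (\mathcal{F}_\sigma - 1)^2 \right) = \mathcal{U} \times \left( \mathcal{F}_\sigma^2 - 1 \right).$$

Clearly, we have for a single induced arc $N(u) = u\left( F_\sigma(u)^2 - 1 \right)$ and for a sequence of
induced arcs, $\mathcal{M} = \mathrm{Seq}(\mathcal{N})$, where

$$M(u) = \frac{1}{1 - u\left( F_\sigma(u)^2 - 1 \right)}.$$

By construction, the maximal arcs of an $\mathcal{F}_\sigma$-diagram coincide with those of its
underlying $\mathcal{V}_\sigma$-diagram. Therefore

$$\mathcal{F}_\sigma = \sum_{t\ge 0} \left( (\mathcal{U} \times \mathcal{M})^m\, \mathcal{F}_\sigma^{2m-1} \right)^t$$

with generating function

$$F_\sigma(u) = \sum_{t\ge 0} \left( \left( \frac{u}{1 - u\left( F_\sigma(u)^2 - 1 \right)} \right)^{m} F_\sigma(u)^{2m-1} \right)^t.$$

Next we inflate the arcs of the $\mathcal{F}_\sigma$-diagram into stacks, $\mathcal{U} \times \mathcal{R}$.
This inflation process generates $\mathcal{W}_\sigma$-diagrams and any $\mathcal{W}_\sigma$-diagram can be
constructed from a unique fixed irreducible shadow σ of genus g with m arcs. We have

$$W_\sigma(u) = \sum_{t\ge 0} \left( \left( \frac{\frac{u}{1-u}}{1 - \frac{u}{1-u}\left( W_\sigma(u)^2 - 1 \right)} \right)^{m} W_\sigma(u)^{2m-1} \right)^t, \tag{4}$$

whence Claim 2.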



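Claim 3: Let M be the set of irreducible shadows of genus g ≤ γ. Then

$$W_M(u)^{-1} = 1 - \sum_{g\le\gamma} \sum_{m\ge 1} i_g(m)\, W_M(u)^{2m-1} \left( \frac{\frac{u}{1-u}}{1 - \frac{u}{1-u}\left( W_M(u)^2 - 1 \right)} \right)^{m}. \tag{5}$$

The maximal arcs of a $\mathcal{V}_M$-structure partition into the maximal arcs of t concatenated
irreducible shadows $\sigma_1, \dots, \sigma_t$ and

$$\sum_{\{\sigma_1, \dots, \sigma_t\}} 1 = \left( \sum_{g\le\gamma} \sum_{m\ge 1} i_g(m) \right)^t. \tag{6}$$

These maximal arcs induce exactly (2m − 1)t σ-intervals. In each σ-interval, we find
again an element of $\mathcal{V}_M$. Thus for any $\sigma_i$ having m arcs, we have $\mathcal{V}_M^{2m-1}$, which
leads to the term $u^m V_M(u)^{2m-1}$. It remains to sum over all $\{\sigma_1, \dots, \sigma_t\}$, i.e. expressing all
the decompositions of $\mathcal{V}_M$-structures into concatenated, irreducible shadows, and we
obtain

$$V_M(u) = \sum_{t\ge 0} \left( \sum_{g\le\gamma} \sum_{m\ge 1} i_g(m)\, u^m\, V_M(u)^{2m-1} \right)^t. \tag{7}$$

The passage from $\mathcal{V}_M$ to $\mathcal{L}_M$ as well as that from $\mathcal{L}_M$ to $\mathcal{F}_M$ follows from Claim
2, whence

$$F_M(u) = \sum_{t\ge 0} \left( \sum_{g\le\gamma} \sum_{m\ge 1} i_g(m) \left( \frac{u}{1 - u\left( F_M(u)^2 - 1 \right)} \right)^{m} F_M(u)^{2m-1} \right)^t. \tag{8}$$

Here $F_M(u)^{-1}$ exists in ℂ[[u]], having a nonzero constant term. Next we inflate the
arcs of the $\mathcal{F}_M$-structure into stacks, obtaining

$$W_M(u)^{-1} = 1 - \sum_{g\le\gamma} \sum_{m\ge 1} i_g(m)\, W_M(u)^{2m-1} \left( \frac{\frac{u}{1-u}}{1 - \frac{u}{1-u}\left( W_M(u)^2 - 1 \right)} \right)^{m}. \tag{9}$$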



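We next derive the functional equation for $H_\gamma(u)$ by incorporating noncrossing
arcs. Since the maximal arcs composed of noncrossing arcs are exactly rainbows, the
generating function of $\mathcal{H}_\gamma$-diagrams nested in a rainbow is given by $u\,H_\gamma(u)$. As in
Claim 3 we conclude

$$H_\gamma(u)^{-1} = 1 - \left( u\,H_\gamma(u) + H_\gamma(u)^{-1} \sum_{g\le\gamma} \sum_{m\ge 1} i_g(m)\, \theta(u)^m \right),$$

where

$$\theta(u) = \frac{u\,H_\gamma(u)^2}{1 - u\,H_\gamma(u)^2}.$$

Setting $w_u(X) = 1 - uX^2$, eq. (1) gives rise to the polynomial

$$P_\gamma(u,X) = w_u(X)^{\kappa_\gamma} \left( -1 + X - uX^2 \right) - \sum_{g\le\gamma} w_u(X)^{\kappa_\gamma}\, I_g\!\left( \frac{uX^2}{w_u(X)} \right), \tag{10}$$

where $\kappa_\gamma = 6\gamma - 2$, $\deg(P_\gamma(u,X)) = (2 + 2\kappa_\gamma)$, $[X^{2+2\kappa_\gamma}]\, P_\gamma(u,X) = -u^{1+\kappa_\gamma}$ and
$P_\gamma(u, H_\gamma(u)) = 0$, whence (a).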

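It remains to prove (b). Since M is the finite set of irreducible shadows of genus
g ≤ γ and any such shadow has 2g ≤ m ≤ (6g − 2) arcs, Andersen et al. (2011b), any
M-shadow has ≤ $\kappa_\gamma$ arcs. Setting $v(u) = 1 - u\,H_\gamma(u)^2$, eq. (1) implies

$$v(u)^{\kappa_\gamma} \left( -1 + H_\gamma(u) - u\,H_\gamma(u)^2 \right) = \sum_{g\le\gamma} \sum_{m} i_g(m)\, v(u)^{\kappa_\gamma - m} \left( u\,H_\gamma(u)^2 \right)^m$$

and consequently

$$H_\gamma(u) = 1 + u\,H_\gamma(u)^2 + \left( 1 - v(u)^{\kappa_\gamma} \right) \left( H_\gamma(u) - 1 - u\,H_\gamma(u)^2 \right) + \sum_{g\le\gamma} \sum_{m} i_g(m)\, v(u)^{\kappa_\gamma - m} \left( u\,H_\gamma(u)^2 \right)^m. \tag{11}$$

All coefficients of $H_\gamma(u)$ in the RHS of eq. (11) are polynomials in u of degree ≥ 1,
whence any $[u^n]H_\gamma(u)$ for $n \ge (\kappa_\gamma + 1)$ can be recursively computed. Accordingly,
eq. (1) determines $H_\gamma(u)$ uniquely.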
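
The recursive computation of the coefficients can be sketched in Python. The sketch multiplies eq. (1) by $H_\gamma(u)$, which gives $H = 1 + uH^2 + \sum_{g\le\gamma} I_g\big(uH^2/(1 - uH^2)\big)$, and iterates this map on truncated power series with integer coefficients; each pass fixes at least one further coefficient because every term except the constant carries a factor u. The helper names and the dictionary encoding of the shadow polynomials are assumptions of this example, and $I_1(z) = z^2 + 2z^3 + z^4$ is used for the irreducible shadows of genus one.

```python
def mul(a, b, N):
    """Product of two power series (coefficient lists), truncated after u^N."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b[: N + 1 - i]):
                c[i + j] += ai * bj
    return c


def inv(a, N):
    """Inverse of a power series with constant term 1, truncated after u^N."""
    b = [0] * (N + 1)
    b[0] = 1
    for n in range(1, N + 1):
        b[n] = -sum(a[k] * b[n - k] for k in range(1, n + 1))
    return b


def matching_coefficients(I, N):
    """h(0), ..., h(N) for H = 1 + u H^2 + I(theta), theta = u H^2 / (1 - u H^2).

    I maps the number of arcs m to the number of irreducible shadows with m arcs.
    """
    H = [1] + [0] * N
    for _ in range(N + 1):                       # one more exact coefficient per pass
        y = [0] + mul(H, H, N)[:N]               # y = u H^2
        theta = mul(y, inv([1] + [-c for c in y[1:]], N), N)
        new = [1] + y[1:]                        # 1 + u H^2
        power = [1] + [0] * N
        for m in range(1, max(I) + 1):
            power = mul(power, theta, N)         # theta^m
            if m in I:
                new = [p + I[m] * q for p, q in zip(new, power)]
        H = new
    return H


I1 = {2: 1, 3: 2, 4: 1}
I2 = {4: 17, 5: 160, 6: 566, 7: 1004, 8: 961, 9: 476, 10: 96}
I12 = {m: I1.get(m, 0) + I2.get(m, 0) for m in set(I1) | set(I2)}

print(matching_coefficients(I1, 5))    # gamma = 1
print(matching_coefficients(I12, 5))   # gamma = 2
```

For γ = 2 the first six coefficients are 1, 1, 3, 15, 105, 945, i.e. all matchings with up to five arcs, as expected since a matching with n arcs has genus at most ⌊n/2⌋. For γ = 1 the sequence starts 1, 1, 3, 15, 88, 547.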
4 Analysis of 1- and 2-matchings

Let us begin by recalling the following algebraic fact:

Lemma 1 Let R be an integral domain and $\varphi\colon R \to \bar{R}$ a homomorphism of R into
an integral domain $\bar{R}$. We consider the induced homomorphism $\bar\varphi\colon R[X] \to \bar{R}[X]$.
Suppose $f = \sum_{i=0}^{n} a_i X^i \in R[X]$ is primitive and $\varphi(a_n) \neq 0$. Then

$$\bar{f} = \bar\varphi(f) \text{ is irreducible} \;\Longrightarrow\; f \text{ is irreducible}. \tag{12}$$

Proof Suppose f is not irreducible in R[X]. Since f is primitive, we then have
necessarily a decomposition f = gh where g, h ∈ R[X] with deg(g) ≥ 1 and deg(h) ≥ 1.
Applying $\bar\varphi$ we obtain

$$\bar{f} = \bar{g}\,\bar{h} \tag{13}$$

and $\varphi(a_n) \neq 0$ guarantees $\deg(\bar{g}) = \deg(g) \ge 1$ and $\deg(\bar{h}) = \deg(h) \ge 1$. Since $\bar{f}$ is by
assumption irreducible, $\bar{f} = \bar{g}\,\bar{h}$ leads to a contradiction. Consequently, f has to be
irreducible in R[X].

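Lemma 2 $P_1(u,X)$ and $P_2(u,X)$ are irreducible in R[X].
Proof According to Theorem 1 we have

$$P_\gamma(u,X) = w_u(X)^{\kappa_\gamma} \left( -1 + X - uX^2 \right) - \sum_{g\le\gamma} w_u(X)^{\kappa_\gamma}\, I_g\!\left( \frac{uX^2}{w_u(X)} \right),$$

where $\deg(P_\gamma(u,X)) = (2 + 2\kappa_\gamma)$, $[X^{2+2\kappa_\gamma}]\, P_\gamma(u,X) = -u^{1+\kappa_\gamma}$ and
$P_\gamma(u, H_\gamma(u)) = 0$. In particular, for γ = 1:

$$\begin{aligned}
P_1(u,X) = \sum_{j=0}^{10} p_{1,j}\, X^{10-j}
 = {} & -u^5X^{10} + u^4X^9 + 3u^4X^8 - 4u^3X^7 - 2u^3X^6 \\
 & + 6u^2X^5 - 3u^2X^4 - 4uX^3 + 3uX^2 + X - 1.
\end{aligned}$$

For γ = 2 we obtain

$$\begin{aligned}
P_2(u,X) = \sum_{j=0}^{22} p_{2,j}\, X^{22-j}
 = {} & -u^{11}X^{22} + u^{10}X^{21} + 9u^{10}X^{20} - 10u^9X^{19} - 35u^9X^{18} + 45u^8X^{17} \\
 & + 75u^8X^{16} - 120u^7X^{15} - 90u^7X^{14} + 210u^6X^{13} + 21u^6X^{12} \\
 & - 252u^5X^{11} - 16u^5X^{10} + 210u^4X^9 - 107u^4X^8 - 120u^3X^7 \\
 & + 75u^3X^6 + 45u^2X^5 - 35u^2X^4 - 10uX^3 + 9uX^2 + X - 1.
\end{aligned}$$

Let $\bar{R} = R/(u - 1)$ and consider the ring homomorphism $R[X] \to \bar{R}[X]$ induced by the
canonical projection $R \to \bar{R}$, $r \mapsto r + (u - 1)$. Clearly, $\bar{R} \cong \mathbb{Z}$, whence R and $\bar{R}$ are
both integral domains. Since $w_u(X) \mapsto 1 - X^2$ in $\bar{R}[X] \cong \mathbb{Z}[X]$,

$$\bar{P}_\gamma(X) = (1 - X^2)^{\kappa_\gamma} \left( -1 + X - X^2 \right) - \sum_{g\le\gamma} (1 - X^2)^{\kappa_\gamma}\, I_g\!\left( \frac{X^2}{1 - X^2} \right)$$

is a primitive polynomial in ℤ[X], where $[X^{2+2\kappa_\gamma}]\, \bar{P}_\gamma(X) = -1 \neq 0$. In particular, for
γ = 1, 2 we derive

$$\bar{P}_1(X) = -X^{10} + X^9 + 3X^8 - 4X^7 - 2X^6 + 6X^5 - 3X^4 - 4X^3 + 3X^2 + X - 1$$

and

$$\begin{aligned}
\bar{P}_2(X) = {} & -X^{22} + X^{21} + 9X^{20} - 10X^{19} - 35X^{18} + 45X^{17} + 75X^{16} \\
 & - 120X^{15} - 90X^{14} + 210X^{13} + 21X^{12} - 252X^{11} - 16X^{10} \\
 & + 210X^9 - 107X^8 - 120X^7 + 75X^6 + 45X^5 - 35X^4 \\
 & - 10X^3 + 9X^2 + X - 1.
\end{aligned}$$

Using Maple we can verify that $\bar{P}_1(X), \bar{P}_2(X) \in \mathbb{Z}[X]$ are both irreducible.

Since $P_1(u,X)$, $P_2(u,X)$ are primitive and $[X^{2+2\kappa_\gamma}]\, \bar{P}_\gamma(X) = -1 \neq 0$, Lemma 1
implies that $P_1(u,X)$, $P_2(u,X)$ are irreducible in R[X].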

Theorem 2 Let i = 1, 2, let $D_{P_i}(u)$ denote the discriminant of $P_i(u,X)$ and let $\mu_i$
denote the real dominant singularity of $H_i(u)$.

(a) the dominant singularity $\mu_i$ is unique and a root of $D_{P_i}(u)$,

(b) at $\mu_i$ we have

$$H_i(u) = \pi_i + \sum_{n\ge 1} a_{i,n} \left( (\mu_i - u)^{1/2} \right)^n, \tag{14}$$

where $\pi_i$ is the root of minimal modulus of $P_i(\mu_i, X)$ and $a_{i,1} \neq 0$.

(c) the coefficients of $H_i(u)$ are asymptotically given by

$$[u^n]\, H_i(u) \sim k_i\, n^{-3/2} \left( \mu_i^{-1} \right)^n \tag{15}$$

for some $k_i > 0$, where $\mu_1^{-1} \approx 8.28425$ and $\mu_2^{-1} \approx 9.8724$, respectively.

Proof In order to prove (a), we observe that Lemma 2 allows us to employ Theorem
12.2.1 of Hille (1962), p. 103. According to eq. (12.2.16) and eq. (12.2.17) of Hille
(1962), the discriminant $D_{P_i}(u)$ can be expressed as

$$D_{P_i}(u) = (-1)^{\frac{1}{2} n (n-1)}\, \frac{1}{p_{i,0}}\, \mathrm{Res}\!\left( P_i, \frac{\partial P_i}{\partial X} \right), \tag{16}$$

where n denotes the degree and $p_{i,0}$ the leading coefficient of $P_i$ in X, and
$\mathrm{Res}(P_i, \partial P_i / \partial X)$ denotes the resultant of $P_i$ and $\partial P_i / \partial X$ as polynomials in X. The
resultant can be computed via a certain determinant, Hille (1962), and we verify by
direct computation that the dominant singularity is the root of minimal modulus of
$D_{P_i}(u)$.

To prove (b), let $\pi_i$ denote the unique real root of minimal modulus of $P_i(\mu_i, X)$.
Then

$$R_i(u_i, X_i) = R_i(\mu_i - u, X - \pi_i) = P_i(u, X) = 0$$

represents, in the variables $u_i = \mu_i - u$ and $X_i = X - \pi_i$, a plane curve with a singularity at the
origin. Puiseux's Theorem, Wall (2004), guarantees a solution of $X - \pi_i$ in
terms of a power series in fractional powers of $(\mu_i - u)$.
Claim. A solution of $P_i(u,X) = 0$ at $u = \mu_i$ is given via a Puiseux series of the
form

$$X = \pi_i + \sum_{n\ge 1} a_{i,n} \left( (\mu_i - u)^{1/2} \right)^n,$$

where $a_{i,1} \neq 0$.

To prove the Claim, we use Lemma 2 and explicitly compute the coefficients
$[u_i^1 X_i^0]\, R_i(u_i, X_i)$ and $[u_i^0 X_i^2]\, R_i(u_i, X_i)$. In particular, both coefficients are
nonzero.

By construction, the curve $R_i(u_i, X_i) = 0$ has a singularity at the origin, whence

$$[u_i^0 X_i^1]\, R_i(u_i, X_i) = \frac{\partial R_i}{\partial X_i}(0,0) = 0.$$

Constructing the Puiseux series via the Newton polygon, we find the first exponent of
$u_i$ to be 1/2 and furthermore

$$[u_i^0 X_i^2]\, R_i(u_i, X_i)\; a_{i,1}^2 + [u_i^1 X_i^0]\, R_i(u_i, X_i) = 0. \tag{17}$$

Combining this observation with Theorem 2.1.1, pp. 15-16, Wall (2004), we derive that
there exists a power series in $u_i^{1/2}$ that satisfies $P_i(u,X) = 0$ at $u = \mu_i$.

According to Theorem 1, $H_i(u)$ is the unique solution of $P_i(u,X) = 0$, which ties
the above Puiseux series to $H_i(u)$, i.e.

$$H_i(u) = \pi_i + \sum_{n\ge 1} a_{i,n} \left( (\mu_i - u)^{1/2} \right)^n. \tag{18}$$

Assertion (c) follows from eq. (18) as a straightforward application of the transfer
theorem, Theorem VI.3, p. 389, Flajolet and Sedgewick (2009).
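
The value $\mu_1^{-1} \approx 8.28425$ can be cross-checked numerically without computing a discriminant. The following Python sketch takes an alternative route and is not the resultant computation used in the proof: writing $y = u\,H_1(u)^2$, eq. (1) gives $H_1 = F(y) = 1 + y + I_1\big(y/(1-y)\big)$ and hence $u = y/F(y)^2$, and the singularity is located at the smallest y > 0 where this parametrization stops being invertible, i.e. where $F(y) - 2yF'(y) = 0$. That this critical point is the dominant singularity is assumed here, as is $I_1(z) = z^2 + 2z^3 + z^4$; all function names are illustrative.

```python
def I1(t):
    """Generating polynomial of the irreducible shadows of genus one."""
    return t**2 + 2 * t**3 + t**4


def dI1(t):
    return 2 * t + 6 * t**2 + 4 * t**3


def F(y):
    """H_1 expressed through y = u * H_1^2."""
    return 1 + y + I1(y / (1 - y))


def dF(y):
    t = y / (1 - y)
    return 1 + dI1(t) / (1 - y) ** 2      # chain rule, dt/dy = 1 / (1 - y)^2


def critical(y):
    """Numerator of du/dy for u = y / F(y)^2 (up to the positive factor F^-3)."""
    return F(y) - 2 * y * dF(y)


def dominant_singularity(lo=0.0, hi=0.5, steps=100):
    """Bisection for the root of critical(y); returns (mu_1, pi_1)."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if critical(mid) > 0:
            lo = mid
        else:
            hi = mid
    y = (lo + hi) / 2
    return y / F(y) ** 2, F(y)


mu1, pi1 = dominant_singularity()
print(1 / mu1, pi1)
```

The printed growth rate agrees with $\mu_1^{-1} \approx 8.284$ to the digits quoted in Theorem 2, and the second value approximates $\pi_1 = H_1(\mu_1)$.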



5 Combinatorics of γ-diagrams
Lemma 3 For any γ ≥ 1, we have

$$S_\gamma(u, e) = \frac{1 + u}{1 + 2u - ue}\; H_\gamma\!\left( \frac{u(1 + u)}{(1 + 2u - ue)^2} \right). \tag{19}$$

Proof We first prove

$$H_\gamma(x, y) = \frac{1}{x + 1 - yx}\; H_\gamma\!\left( \frac{x}{(x + 1 - yx)^2} \right). \tag{20}$$

Choose $\xi \in \mathcal{H}_\gamma(s + 1, m + 1)$ and label one of its 1-arcs. Since we can label any
of the (m + 1) 1-arcs of ξ, $(m + 1)\, h_\gamma(s + 1, m + 1)$ different such labeled linear arc
diagrams arise. On the other hand, to produce ξ with this labeling, we can add one
labeled 1-arc to an element of $\mathcal{H}_\gamma(s, m + 1)$ by inserting a parallel copy of an existing
1-arc or by inserting a new labeled 1-arc in an element of $\mathcal{H}_\gamma(s, m)$, where we may
only insert the 1-arc between two vertices not already forming a 1-arc. It follows that
we have the recursion

$$(m + 1)\, h_\gamma(n + 1, m + 1) = (m + 1)\, h_\gamma(n, m + 1) + (2n + 1 - m)\, h_\gamma(n, m)$$

or equivalently the PDE

$$\frac{\partial H_\gamma(x,y)}{\partial y} = x\, \frac{\partial H_\gamma(x,y)}{\partial y} + 2x^2\, \frac{\partial H_\gamma(x,y)}{\partial x} + x\, H_\gamma(x,y) - xy\, \frac{\partial H_\gamma(x,y)}{\partial y}, \tag{21}$$

which is thus satisfied by $H_\gamma(x, y)$.
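
The recursion can be checked by brute force for small n. The following Python sketch enumerates all perfect matchings on [2n] and tabulates them by their number m of 1-arcs. It ignores the genus restriction, which is an assumption of the example: the unrestricted matchings coincide with the γ-matchings as long as n ≤ 2γ + 1, since a matching with n arcs has genus at most ⌊n/2⌋. The function names are hypothetical.

```python
from collections import Counter


def all_matchings(points):
    """Yield every perfect matching of a sorted list of points as a list of arcs (i, j)."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, other in enumerate(rest):
        for m in all_matchings(rest[:k] + rest[k + 1:]):
            yield [(first, other)] + m


def one_arc_table(n):
    """Counter h with h[m] = number of matchings with n arcs and exactly m 1-arcs."""
    return Counter(
        sum(1 for i, j in matching if j == i + 1)
        for matching in all_matchings(list(range(1, 2 * n + 1)))
    )


def recursion_holds(n):
    """Check (m+1) h(n+1, m+1) = (m+1) h(n, m+1) + (2n+1-m) h(n, m) for all m."""
    h_n, h_next = one_arc_table(n), one_arc_table(n + 1)
    return all(
        (m + 1) * h_next[m + 1] == (m + 1) * h_n[m + 1] + (2 * n + 1 - m) * h_n[m]
        for m in range(n + 1)
    )


print([recursion_holds(n) for n in range(4)])
print(sorted(one_arc_table(3).items()))
```

For n = 3 the table reads h(3, 0) = 5, h(3, 1) = 6, h(3, 2) = 3 and h(3, 3) = 1, which sums to the 15 matchings on six vertices.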
On the other hand,

$$H^*_\gamma(x, y) = \frac{1}{x + 1 - yx}\; H_\gamma\!\left( \frac{x}{(x + 1 - yx)^2} \right)$$

is also a solution of eq. (21), which specializes to $H_\gamma(x) = H^*_\gamma(x, 1)$, and moreover,
we have $h^*_\gamma(n, m) = [x^n y^m]\, H^*_\gamma(x, y) = 0$, for m > n. Indeed, the first assertion is
easily verified directly, the specialization is obvious, and the fact that y only appears
in the power series $H^*_\gamma(x, y)$ in the form of products xy implies that $h^*_\gamma(n, m) = 0$, for
m > n. Thus, the coefficients $h^*_\gamma(n, m)$ satisfy the same recursion and initial conditions
as $h_\gamma(n, m)$, and hence by induction on n, we conclude $h^*_\gamma(n, m) = h_\gamma(n, m)$, for
n, m ≥ 0. This proves that $H_\gamma(x, y)$ indeed satisfies eq. (20) as was claimed.

To complete the proof of eq. (19), we use that the projection ϑ is surjective and
affects neither irreducible shadows nor the number of 1-arcs. Let us consider a fixed
γ-shape, λ, having s arcs, of which t are 1-arcs, and the generating function $H^\lambda_\gamma(x, y)$,
counting γ-matchings that project into λ. Then

$$H^\lambda_\gamma(x, y) = \left( \frac{x}{1 - x} \right)^{s} y^{t},$$

which shows that $H^\lambda_\gamma(x, y)$ depends only on the total number of arcs and number of
1-arcs in λ. Consequently,

$$H_\gamma(x, y) = \sum_{s\ge 0} \sum_{m=0}^{s} s_\gamma(s, m) \left( \frac{x}{1 - x} \right)^{s} y^{m} = S_\gamma\!\left( \frac{x}{1 - x}, y \right). \tag{22}$$

Setting $u = \frac{x}{1-x}$, i.e., $x = \frac{u}{1+u}$, and e = y, we arrive at

$$S_\gamma(u, e) = \frac{1 + u}{1 + 2u - ue}\; H_\gamma\!\left( \frac{u(1 + u)}{(1 + 2u - ue)^2} \right)$$

as required.

Lemma 4 Let λ be a fixed γ-shape with s ≥ 1 arcs and m ≥ 0 1-arcs. Then the
generating function of τ-canonical γ-diagrams containing no 1-arc that have shape λ
is given by

$$G^\lambda_{\tau,\gamma}(z) = (1 - z)^{-1} \left( \frac{z^{2\tau}}{(1 - z^2)(1 - z)^2 - (2z - z^2)\, z^{2\tau}} \right)^{s} z^{m}.$$

In particular, $G^\lambda_{\tau,\gamma}(z)$ depends only upon the number of arcs and 1-arcs in λ.

Our main result about enumerating τ-canonical γ-structures follows.
Table 1 The exponential growth rates $\rho_{s,i}^{-1}$, for 1 ≤ s, i ≤ 2.

(s, i)               (1,1)     (2,1)     (1,2)     (2,2)
$\rho_{s,i}^{-1}$    3.6005    2.2759    3.8846    2.3553

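Theorem 3 Suppose γ, τ ≥ 1 and let $u_\tau(z) = \frac{(z^2)^{\tau-1}}{z^{2\tau} - z^2 + 1}$. Then the generating function
$G_{\tau,\gamma}(z)$ is algebraic and given by

$$G_{\tau,\gamma}(z) = \frac{1}{u_\tau(z)\, z^2 - z + 1}\; H_\gamma\!\left( \frac{u_\tau(z)\, z^2}{\left( u_\tau(z)\, z^2 - z + 1 \right)^2} \right). \tag{23}$$

In particular for 1 ≤ s, i ≤ 2 we have

$$[z^n]\, G_{s,i}(z) \sim k_{s,i}\, n^{-3/2} \left( \rho_{s,i}^{-1} \right)^n$$

for some constants $k_{s,i} > 0$; for $\rho_{s,i}^{-1}$ see Table 1.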

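Proof Since each γ-diagram has a unique γ-shape, λ, having some number m ≥ 0 of
1-arcs, we have

$$G_{\tau,\gamma}(z) = \sum_{m\ge 0}\; \sum_{\substack{\lambda\ \gamma\text{-shape} \\ \text{having } m\ 1\text{-arcs}}} G^\lambda_{\tau,\gamma}(z). \tag{24}$$

According to Lemma 4, $G^\lambda_{\tau,\gamma}(z)$ only depends on the number of arcs and 1-arcs of λ,
and we can therefore express

$$\begin{aligned}
G_{\tau,\gamma}(z) &= (1 - z)^{-1}\, S_\gamma\!\left( \frac{z^{2\tau}}{(1 - z^2)(1 - z)^2 - (2z - z^2)\, z^{2\tau}},\; z \right) \\
&= \frac{1}{(1 - z) + u_\tau(z)\, z^2}\; H_\gamma\!\left( \frac{z^2\, u_\tau(z)}{\left( (1 - z) + u_\tau(z)\, z^2 \right)^2} \right),
\end{aligned}$$

using Lemma 3 in order to confirm eq. (23), where the second equality follows from
direct computation. Let

$$\theta_\tau(z) = \frac{z^2\, u_\tau(z)}{\left( (1 - z) + u_\tau(z)\, z^2 \right)^2}$$

denote the argument of $H_\gamma$ in this expression. By definition we have $\theta_\tau(z) \in \mathbb{C}(z)$.
Since $\theta_\tau(0) = 0$ the composition $H_\gamma(\theta_\tau(z))$ is well-defined as a power series. Obviously,
$P_\gamma(u, H_\gamma(u)) = 0$ guarantees $P_\gamma(\theta_\tau(z), H_\gamma(\theta_\tau(z))) = 0$. We have the following Hasse
diagram of fields

$$\mathbb{C}(z, u_\tau(z)) \;\subseteq\; \mathbb{C}(z, \theta_\tau(z)) \;\subseteq\; \mathbb{C}(z, \theta_\tau(z), H_\gamma(\theta_\tau(z))),$$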
from which we immediately conclude that $G_{\tau,\gamma}(z)$ is algebraic. Pringsheim's
Theorem, Flajolet and Sedgewick (2009), guarantees that for any γ, τ ≥ 1, $G_{\tau,\gamma}(z)$ has a
dominant real singularity $\rho_{\tau,\gamma} > 0$.
According to Theorem 2 we have

$$H_i(z) = \pi_i + \sum_{n\ge 1} a_{i,n} \left( (\mu_i - z)^{1/2} \right)^n \quad\text{and}\quad [z^n]\, H_i(z) \sim k_i\, n^{-3/2} \left( \mu_i^{-1} \right)^n.$$

For τ = 1, 2, we verify directly that $\rho_{1,1}$ and $\rho_{2,1}$ are the unique solutions of minimum
modulus of $\theta_1(z) = \mu_1$ and $\theta_2(z) = \mu_1$. These solutions are strictly smaller than any
other singularities of $\theta_1(z)$ and $\theta_2(z)$ and furthermore satisfy $\theta_1'(\rho_{1,1}) \neq 0$ as well as
$\theta_2'(\rho_{2,1}) \neq 0$. It follows that $G_{1,1}(z)$ and $G_{2,1}(z)$ are governed by the supercritical
paradigm, Flajolet and Sedgewick (2009), which in turn implies

$$[z^n]\, G_{s,1}(z) \sim k_{s,1}\, n^{-3/2} \left( \rho_{s,1}^{-1} \right)^n \tag{25}$$

where s = 1, 2 and $k_{s,1}$ is some positive constant.
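
The growth rates for γ = 1 can be reproduced numerically from $\mu_1$. The following Python sketch solves $\theta_\tau(z) = \mu_1$ by bisection, taking $\mu_1^{-1} \approx 8.28425$ from Theorem 2, $u_\tau(z) = (z^2)^{\tau-1}/(z^{2\tau} - z^2 + 1)$ and $\theta_\tau(z) = u_\tau(z) z^2 / (1 - z + u_\tau(z) z^2)^2$. The bracketing interval [0, 1/2], on which $\theta_\tau$ increases from 0 past $\mu_1$ for τ = 1, 2, and the function names are choices made for this illustration.

```python
def u(tau, z):
    """u_tau(z) = (z^2)^(tau-1) / (z^(2 tau) - z^2 + 1)."""
    return (z * z) ** (tau - 1) / (z ** (2 * tau) - z * z + 1)


def theta(tau, z):
    """Argument of H_gamma in the generating function of tau-canonical gamma-structures."""
    w = u(tau, z) * z * z
    return w / (1 - z + w) ** 2


def growth_rate(tau, mu, lo=0.0, hi=0.5, steps=100):
    """Inverse of the root of theta(tau, z) = mu inside [lo, hi], found by bisection."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if theta(tau, mid) < mu:
            lo = mid
        else:
            hi = mid
    return 2 / (lo + hi)


mu1 = 1 / 8.28425          # dominant singularity of H_1(u), Theorem 2
for tau in (1, 2):
    print(tau, growth_rate(tau, mu1))
```

Up to the rounding of $\mu_1$, the two printed values agree with the entries 3.6005 and 2.2759 of Table 1.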

Theorem 3 has its analogue for τ-canonical γ-diagrams containing 1-arcs. The
asymptotic formula in case of τ = 1, γ = 1
is due to Nebel and Weinberg (2011a), who used the explicit grammar developed in
Reidys et al. (2011) in order to obtain an algebraic equation for $\mathbf{G}_{1,1}(z)$.

Corollary 1 Suppose γ, τ ≥ 1 and let $u_\tau(z) = \frac{(z^2)^{\tau-1}}{z^{2\tau} - z^2 + 1}$. Then the generating
function of τ-canonical γ-diagrams containing 1-arcs, $\mathbf{G}_{\tau,\gamma}(z)$, is algebraic and

$$\mathbf{G}_{\tau,\gamma}(z) = \frac{1}{1 - z}\; H_\gamma\!\left( \frac{u_\tau(z)\, z^2}{(1 - z)^2} \right). \tag{26}$$

In particular for γ = 1 we have

$$[z^n]\, \mathbf{G}_{1,1}(z) \sim j_1\, n^{-3/2} \left( \varrho_{1,1}^{-1} \right)^n \quad\text{and}\quad [z^n]\, \mathbf{G}_{2,1}(z) \sim j_2\, n^{-3/2} \left( \varrho_{2,1}^{-1} \right)^n$$

for some constants $j_1, j_2$, where $\varrho_{1,1}^{-1} = 3.8782$ and $\varrho_{2,1}^{-1} = 2.3361$.

Proof Let λ be a fixed γ-shape with s ≥ 1 arcs and m ≥ 0 1-arcs. Then the generating
function of τ-canonical γ-diagrams containing 1-arcs that have shape λ
is given by

$$\mathbf{G}^\lambda_{\tau,\gamma}(z) = (1 - z)^{-1} \left( \frac{z^{2\tau}}{(1 - z^2)(1 - z)^2 - (2z - z^2)\, z^{2\tau}} \right)^{s}.$$
6 Discussion

The symbolic approach based on γ-matchings allows us to do more than compute the
generating function of canonical γ-structures. On the basis of Theorem 3 it is possible to
obtain a plethora of statistics of γ-structures by means of combinatorial markers.

For instance, we can analogously compute the bivariate generating function of
τ-canonical γ-structures over n vertices, containing exactly m arcs, $A_{\tau,\gamma}(z,t)$, as

$$A_{\tau,\gamma}(z,t) = \frac{1}{u_\tau(z,t)\, z^2 - z + 1}\; H_\gamma\!\left( \frac{u_\tau(z,t)\, z^2}{\left( u_\tau(z,t)\, z^2 - z + 1 \right)^2} \right), \tag{27}$$

where $u_\tau(z,t)$ is given by

$$u_\tau(z,t) = \frac{t\, (t z^2)^{\tau-1}}{(t z^2)^{\tau} - t z^2 + 1}.$$

This bivariate generating function is the key to obtaining a central limit theorem for the
distribution of arc-numbers in γ-structures, Bender (1973), on the basis of the Lévy-Cramér
Theorem on limit distributions, Feller (1991).

Statistical properties of γ-structures play a key role for quantifying algorithmic
improvements via sparsifications, Busch et al. (2008); Möhl et al. (2010); Wexler et al.
(2007). The key property here is the polymer-zeta property, Kabakcioglu and Stella
(2005); Kafri et al. (2000), which states that the probability of an arc of length ℓ is
bounded by $k\,\ell^{-c}$, where k is some positive constant and c > 1. Polymer-zeta stems from
the theory of self-avoiding walks, Vanderzande (1998), and has only been empirically
established for the simplest class of RNA structures, namely those of genus zero. It turns
out, however, that the polymer-zeta property is genuinely a combinatorial property of
a structure class. Moreover our results allow us to quantify the effect of sparsifications of
folding algorithms into γ-structures, Andersen et al. (2011b); Huang and Reidys.

We finally remark that around 98% of RNA pseudoknot structures catalogued in
databases are in fact canonical 1-structures. RNA pseudoknot structures like the
HDV-virus, exhibiting irreducible shadows of genus two, are relatively rare.